EXAMINER'S AMENDMENT

This action is responsive to the following communication: Request for Continued
Examination, filed 11/10/2008.

An examiner's amendment to the record appears below. Should the changes 
and/or additions be unacceptable to applicant, an amendment may be filed as provided 
by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be
submitted no later than the payment of the issue fee. 

Authorization for this examiner's amendment was given in a telephone interview
with the Applicants' representative, Anthony F. Bonner, on Thursday, Jan. 15 and
Thursday, Jan. 22, 2009 (confirmation message).

The application has been amended as follows: 

Claim 1. (Currently Amended) A method comprising:

training an email system for determining spam, where training includes at least 
the following: 

[[receiving an email message having a word;]]
retrieving a first email message;

generating a phonetic equivalent of the at least one word from a body portion of 
the email message; 

tokenizing the phonetic equivalent of the word to generate a token representative
of the phonetic equivalent;

tokenizing at least one word in a subject line of the first email message;

tokenizing at least one simple mail transfer protocol (SMTP) email address associated
with the first email message;

tokenizing at least one domain name associated with the first email message;

tokenizing at least one attachment of the first email message, wherein tokenizing
the at least one attachment includes generating a 128-bit MD5 hash of the
attachment, appending a 32-bit length of the attachment to the generated MD5 hash
resulting in a 160-bit number, and UUencoding the resulting 160-bit number;

determining a spam probability from the generated [[token]] tokens;

in response to [[determining]] a determination that the spam probability from the
generated [[token]] tokens indicates that the first email message is likely spam:

determining whether the generated tokens are present in a database of
tokens;

in response to a determination that at least one of the generated tokens is
not present in the database of tokens, assigning [[whether the token exists in]] a
probability value for each token as spam and adding the token and assigned
probability value to the database of tokens; and

in response to [[determining]] a determination that the token [[exists]] is present
in the database of tokens, updating a probability value of the token; and

in response to [[determining]] a determination that the spam probability from
the generated [[tokens,]] tokens, indicates that the first email message is not likely
spam:

determining whether the generated tokens are present in a database of
tokens;

in response to a determination that at least one of the generated tokens is
not present in the database of tokens, assigning a probability value for each
token indicative of non-spam and adding the token and assigned probability
value to the database of tokens; and

[[in response to determining that the token does not exist in the database of
tokens, assigning a probability value indicative of spam to the token.]]

in response to a determination that the token is present in the database of
tokens, updating a probability value of the token;

sorting the generated tokens in accordance with the corresponding determined spam
probability value; and

filtering a second email message according to the training.
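
The database-update steps recited in claim 1 can be sketched as follows. This is only an illustration under stated assumptions, not the applicant's implementation: the names `train_on_message`, `token_db`, `SPAM_VALUE`, `NON_SPAM_VALUE` and `rate` are hypothetical, and the update rule (moving the stored value toward the spam or non-spam value) is an assumption, since the claim says only that the probability value of an existing token is updated.

```python
# Hypothetical starting values; the claim does not fix any numbers.
SPAM_VALUE = 0.99
NON_SPAM_VALUE = 0.01


def train_on_message(tokens, is_likely_spam, token_db, rate=0.1):
    """Update token_db (token -> spam probability) from one message.

    Tokens missing from the database are added with a value indicative
    of spam or non-spam; tokens already present have their value updated.
    """
    target = SPAM_VALUE if is_likely_spam else NON_SPAM_VALUE
    for token in tokens:
        if token not in token_db:
            token_db[token] = target
        else:
            # Assumed update rule: move the stored value toward the target.
            token_db[token] += rate * (target - token_db[token])
    # Sort the tokens by their spam probability value, highest first.
    return sorted(tokens, key=lambda t: token_db[t], reverse=True)
```

For example, calling `train_on_message(["viagra", "hello"], True, {"hello": 0.5})` adds `viagra` with the spam value and nudges the stored value of `hello` upward.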

2. (Previously Presented) The method of claim 1, wherein generating the
phonetic equivalent of the word comprises: 

identifying a string of characters, the string of characters including a
non-alphabetic character; and

removing the non-alphabetic character from the string of characters.

3. (Previously Presented) The method of claim 2, wherein removing the
non-alphabetic character comprises:

locating a non-alphabetic character within the string of characters, the
non-alphabetic character being at least one selected from the group consisting
of:

" (quote);
' (single quote);
! (exclamation mark);
@ (at);
# (pound);
$ (dollar);
% (percent);
^ (caret);
& (ampersand);
* (asterisk);
( (open parenthesis);
) (close parenthesis);
_ (underscore);
- (hyphen);
+ (plus);
= (equal);
\ (backslash);
/ (slash);
? (question mark);
(space);
(tab);
[ (open square bracket);
] (close square bracket);
{ (open bracket);
} (close bracket);
< (less than);
> (greater than);
, (comma);
: (colon);
; (semi-colon); and
. (period).
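
The character removal recited in claims 2 and 3 can be illustrated with a short sketch. The names `NON_ALPHABETIC` and `strip_non_alphabetic` are hypothetical, and the sketch covers only the removal step; whatever further processing the applicant may use to arrive at a phonetic equivalent is not shown.

```python
# The 31 characters enumerated in claim 3, including space and tab.
NON_ALPHABETIC = set("\"'!@#$%^&*()_-+=\\/? \t[]{}<>,:;.")


def strip_non_alphabetic(text):
    """Return text with every character listed in claim 3 removed."""
    return "".join(ch for ch in text if ch not in NON_ALPHABETIC)
```

Under these assumptions, an obfuscated string such as `v.i.a.g.r.a` collapses to `viagra`, so both spellings yield the same token.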

4. (Previously Presented) The method of claim 1, wherein determining the
spam probability comprises: 

assigning a spam probability value to the token; and 
generating a Bayesian probability value using the spam probability value 
assigned to the token. 

5. (Previously Presented) The method of claim 4, wherein determining the 
spam probability further comprises: comparing the generated Bayesian 
probability value with a predefined threshold value. 

6. (Previously Presented) The method of claim 5, wherein determining the 
spam probability further comprises: categorizing the email message as spam in 
response to the Bayesian probability value being greater than the predefined 
threshold. 

7. (Previously Presented) The method of claim 5, wherein determining the
spam probability further comprises: categorizing the email message as non-spam
in response to the Bayesian probability value being not greater than the
predefined threshold.
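
Claims 4 through 7 can be read together as a combine-then-compare procedure, sketched below. The combining formula is an assumption (the common naive-Bayes combination of per-token probabilities); the claims say only that a Bayesian probability value is generated from the assigned token values. The function names and the threshold of 0.9 are hypothetical, and the sketch assumes every token probability lies strictly between 0 and 1.

```python
import math


def bayesian_probability(token_probs):
    """Combine per-token spam probabilities into a single value."""
    spam = math.prod(token_probs)
    ham = math.prod(1.0 - p for p in token_probs)
    return spam / (spam + ham)


def categorize(token_probs, threshold=0.9):
    """Return 'spam' if the combined value is greater than the threshold."""
    combined = bayesian_probability(token_probs)
    # Claim 6: greater than the threshold; claim 7: not greater than it.
    return "spam" if combined > threshold else "non-spam"
```

A value exactly equal to the threshold falls under claim 7 ("not greater than") and is categorized as non-spam.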

8. (Currently Amended) A training email system for determining spam on a 
computer storage medium comprising: 

means for receiving an email message having a [[word;]] word and an
attachment;

means for generating a phonetic equivalent of the at least one word from a 
body portion of the email message; 

means for tokenizing the phonetic equivalent of the word to generate a 
token representative of the phonetic equivalent; 

means for tokenizing at least one word in a subject line of the first email 
message; 

means for tokenizing at least one word in a subject line of the first email
message; tokenizing at least one simple mail transfer protocol (SMTP) email address
associated with the first email message;

means for tokenizing at least one domain name associated with the first email 
message; 

means for tokenizing at least one attachment of the first email message, wherein
tokenizing the at least one attachment includes generating a 128-bit MD5 hash of the
attachment, appending a 32-bit length of the attachment to the generated MD5 hash
resulting in a 160-bit number, and UUencoding the resulting 160-bit number;

means for determining a spam probability from the generated tokens;

in response to a determination that the spam probability from the generated
tokens, means for indicating that the first email message is likely spam:

means for determining whether the generated tokens are present in a
database of tokens;

in response to a determination that at least one of the generated tokens is
not present in the database of tokens, means for assigning a probability value for
each token as spam and adding the token and assigned probability value to the
database of tokens; and

in response to a determination that the token is present in the database of
tokens, means for updating a probability value of the token; and

in response to a determination that the spam probability from the
generated tokens, means for indicating that the first email message is not likely
spam:

determining whether the generated tokens are present in a
database of tokens;

in response to a determination that at least one of the generated
tokens is not present in the database of tokens, assigning a probability
value for each token indicative of non-spam and adding the token and
assigned probability value to the database of tokens; and
[[a means for tokenizing the attachment;]]

[[means for determining a spam probability from the generated token; and]]

[[means for sorting the generated tokens in accordance with the
corresponding determined spam probability value.]]

in response to a determination that the token is present in the database of
tokens, updating a probability value of the token;

sorting the generated tokens in accordance with the corresponding determined
spam probability value; and

filtering a second email message according to the training.

9. (Currently Amended) A system comprising: 

a processor; and 

a memory, the memory storing: 

receive logic configured to receive an email message having a
[[word;]] word and an attachment;

phonetic logic configured to generate a phonetic equivalent of the 
word from the email message; 

first tokenize logic configured to tokenize the phonetic equivalent of 
the word to generate a token representative of the phonetic equivalent; 
and 

second tokenize logic configured to tokenize the attachment; 

tokenizing at least one word in a subject line of the first email message; 
tokenizing at least one simple mail transfer protocol (SMTP) email address associated 
with the first email message; 

tokenizing at least one domain name associated with the first email message;

tokenizing at least one attachment of the first email message, wherein tokenizing
the at least one attachment includes generating a 128-bit MD5 hash of the
attachment, appending a 32-bit length of the attachment to the generated MD5 hash
resulting in a 160-bit number, and UUencoding the resulting 160-bit number;

determining a spam probability from the generated tokens;

in response to a determination that the spam probability from the generated 
tokens indicates that the first email message is likely spam: 

determining whether the generated tokens are present in a database of
tokens;

in response to a determination that at least one of the generated tokens is
not present in the database of tokens, assigning a probability value for each
token as spam and adding the token and assigned probability value to the
database of tokens; and

in response to a determination that the token is present in the database of
tokens, updating a probability value of the token; and

in response to a determination that the spam probability from the 
generated tokens, indicates that the first email message is not likely spam: 

determining whether the generated tokens are present in a database of
tokens;

in response to a determination that at least one of the generated tokens is
not present in the database of tokens, assigning a probability value for each



Application/Control Number: 10/685,558 
Art Unit: 2163 




token indicative of non-spam and adding the token and assigned probability
value to the database of tokens; and

in response to a determination that the token is present in the database of
tokens, updating a probability value of the token;

sorting the generated tokens in accordance with the corresponding determined spam
probability value; and

filtering a second email message according to the training.

[[spam determination logic configured to determine a spam
probability from the generated tokens; and sorting logic configured to sort
the generated tokens in accordance with the corresponding determined
spam probability value.]]

10. (Previously Presented) The system of claim 9, the memory further 
storing: 

string-identification logic configured to identify a string of characters, the 
string of characters including a non-alphabetic character; and 

character-removal logic configured to remove the non-alphabetic 
character from the string of characters. 

11. (Previously Presented) The system of claim 10, the memory further
storing: spam-probability logic configured to assign a spam probability value to
the token; and Bayesian logic configured to generate a Bayesian probability
value using the spam probability value assigned to the token.

12. (Previously Presented) The system of claim 11, the memory further
storing: compare logic configured to compare the generated Bayesian probability
value with a predefined threshold value.

13. (Previously Presented) The system of claim 12, the memory further 
storing: spam-categorization logic configured to categorize the email message as 
spam in response to the Bayesian probability value being greater than the 
predefined threshold. 

14. (Previously Presented) The system of claim 12, the memory further 
storing: spam-categorization logic configured to categorize the email message as 
non-spam in response to the Bayesian probability value being not greater than 
the predefined threshold. 

15. (Currently Amended) A computer-readable medium that includes a 
program that, when executed by a computer, causes the computer to perform at 
least the following: 

[[a processor; and a memory, the memory storing:]]
[[computer readable code adapted to instruct a programmable device to]]
receive an email message having a word and an attachment;
[[computer readable code adapted to instruct a programmable device to]]
generate a phonetic equivalent of the word from the email message;
[[computer readable code adapted to instruct a programmable device to]]
tokenize the phonetic equivalent of the word to generate a token
representative of the phonetic equivalent;

tokenize the attachment;

generate a phonetic equivalent of at least one word from a body portion of
the email message;

tokenize the phonetic equivalent of the word to generate a token
representative of the phonetic equivalent;

tokenize at least one word in a subject line of the first email message;

tokenizing at least one simple mail transfer protocol (SMTP) email address
associated with the first email message;

tokenize at least one domain name associated with the first email 
message; 

tokenize at least one attachment of the first email message, wherein
tokenizing the at least one attachment includes generating a 128-bit MD5 hash
of the attachment, appending a 32-bit length of the attachment to the generated
MD5 hash resulting in a 160-bit number, and UUencoding the resulting 160-bit
number;

determine a spam probability from the generated tokens;

in response to a determination that the spam probability from the 
generated tokens, indicate that the first email message is likely 
spam: 

determine whether the generated tokens are present in a
database of tokens;

in response to a determination that at least one of the
generated tokens is not present in the database of tokens,
assigning a probability value for each token as spam and adding
the token and assigned probability value to the database of tokens;
and

in response to a determination that the token is present in the
database of tokens, updating a probability value of the token; and

in response to a determination that the spam probability from the
generated tokens, indicates that the first email message is not likely spam:

determining whether the generated tokens are present in a database of
tokens;

in response to a determination that at least one of the generated tokens is
not present in the database of tokens, assigning a probability value for each
token indicative of non-spam and adding the token and assigned probability
value to the database of tokens; and

in response to a determination that the token is present in the database of
tokens, update a probability value of the token;

sort the generated tokens in accordance with the corresponding
determined spam probability value; and

filter a second email message according to the training.

[[computer readable code adapted to instruct a programmable device to
determine a spam probability from the generated token; and]]

[[sort the generated tokens in accordance with the corresponding
determined spam probability value.]]



16. (Currently Amended) The computer-readable medium of claim 15, [[the
memory further storing:]] the program further causing the computer to perform at
least the following:

computer readable code adapted to instruct a programmable device to 
identify a string of characters, the string of characters including a non-alphabetic 
character; and 

[[computer readable code adapted to instruct a programmable device to]]
remove the non-alphabetic character from the string of characters.

17. (Currently Amended) The computer-readable medium of claim 15, [[the
memory further storing:]] the program further causing the computer to perform at
least the following: [[computer readable code adapted to instruct a programmable
device to]] assign a spam probability value to the token; and [[computer readable
code adapted to instruct a programmable device to]] generate a Bayesian
probability value using the spam probability value assigned to the token.

18. (Currently Amended) The computer-readable medium of claim 17, [[the
memory further storing:]] the program further causing the computer to perform at
least the following: computer readable code adapted to instruct a programmable
device to compare the generated Bayesian probability value with a predefined
threshold value.

19. (Currently Amended) The computer-readable medium of claim 18, [[the
memory further storing:]] the program further causing the computer to perform at
least the following: [[computer readable code adapted to instruct a programmable
device to]] categorize the email message as spam in response to the Bayesian
probability value being greater than the predefined threshold.

20. (Currently Amended) The computer-readable medium of claim 18, [[the
memory further storing:]] the program further causing the computer to perform at
least the following: [[computer readable code adapted to instruct a programmable
device to]] categorize the email message as non-spam in response to the
Bayesian probability value being not greater than the predefined threshold.

Allowable Subject Matter 
Claims 1-20 are allowed. 

The following is an examiner's statement of reasons for allowance: Independent
claims 1, 8, 9 and 15, when considered as a whole, are allowable over the prior art of
record. Specifically, the prior art of record fails to clearly teach or fairly suggest the
combination of the following limitations:

• generating a phonetic equivalent of at least one word from a body portion of the 
email message; 

• tokenizing the phonetic equivalent of the word to generate a token representative
of the phonetic equivalent;

• tokenizing at least one word in a subject line of the first email message;

• tokenizing at least one simple mail transfer protocol (SMTP) email address
associated with the first email message;

• tokenizing at least one domain name associated with the first email message; 

• tokenizing at least one attachment of the first email message, wherein tokenizing
the at least one attachment includes generating a 128-bit MD5 hash of the
attachment, appending a 32-bit length of the attachment to the generated MD5
hash resulting in a 160-bit number, and UUencoding the resulting 160-bit
number;

• determining a spam probability from the generated tokens; 

• in response to a determination that the spam probability from the generated 
tokens, indicates that the first email message is likely spam: 

■ determining whether the generated tokens are present in a
database of tokens;

■ in response to a determination that at least one of the generated
tokens is not present in the database of tokens, assigning a
probability value for each token as spam and adding the token and
assigned probability value to the database of tokens; and

■ in response to a determination that the token is present in the
database of tokens, updating a probability value of the token; and

■ in response to a determination that the spam probability from the
generated tokens, indicates that the first email message is not likely
spam:

■ determining whether the generated tokens are present in a 
database of tokens; 

■ in response to a determination that at least one of the generated
tokens is not present in the database of tokens, assigning a
probability value for each token indicative of non-spam and adding
the token and assigned probability value to the database of tokens;
and

■ in response to a determination that the token is present in the 
database of tokens, updating a probability value of the token; 
sorting the generated tokens in accordance with the corresponding 
determined spam probability value; and filtering a second email 
message according to the training. 
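
The attachment-tokenizing limitation above (128-bit MD5 hash, plus a 32-bit length, giving a 160-bit number that is then UUencoded) can be sketched as follows. The function name `attachment_token` is hypothetical, and several details are assumptions the claim language does not settle: the length is taken as a byte count, packed big-endian, and reduced modulo 2^32 for oversized attachments.

```python
import binascii
import hashlib
import struct


def attachment_token(data: bytes) -> str:
    """Return a UUencoded token for an attachment's raw bytes."""
    digest = hashlib.md5(data).digest()                 # 16 bytes = 128 bits
    length = struct.pack(">I", len(data) & 0xFFFFFFFF)  # 4 bytes = 32 bits
    combined = digest + length                          # 20 bytes = 160 bits
    # b2a_uu encodes up to 45 bytes as a single uuencoded line.
    return binascii.b2a_uu(combined).decode("ascii").rstrip("\n")
```

Appending the length means two attachments must agree in both hash and size to produce the same token, and identical attachments always map to the same token.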

The dependent claims 2-7, 10-14 and 16-20, further add limitations to the 
allowable subject matter of the corresponding independent claims; thus they are also 
allowable. 

Any comments considered necessary by applicant must be submitted no later
than the payment of the issue fee and, to avoid processing delays, should preferably
accompany the issue fee. Such submissions should be clearly labeled "Comments on
Statement of Reasons for Allowance."

Inquiry 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to TUAN-KHANH PHAN whose telephone number is 
(571)270-3047. The examiner can normally be reached on 4/5/9. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Don Wong, can be reached on 571-272-1834. The fax phone number for the
organization where this application or proceeding is assigned is 571-273-8300. 

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 
/T. K. P./

Examiner, Art Unit 2163 
/don wong/ 

Supervisory Patent Examiner, Art Unit 2163 



